A Survey on Document Clustering For Identifying Criminal
Abstract
Crimes are a social nuisance and cost our society dearly in several ways. Crime investigation plays a very significant role in the police system of any country. A good crime analysis tool is required to identify crime patterns quickly and efficiently and to support future crime pattern detection. This paper presents a combined approach of clustering, outlier detection, and a rule engine to identify criminals. Data mining is the computer-assisted process of sifting through and analysing large amounts of data and then extracting the meaning of the data. It is also the process of analysing data from different perspectives and summarizing it into useful information. Data mining plays an important role in prediction and analysis. Clustering is the task of grouping a set of objects in such a way that objects in the same group are more similar to each other than to those in other groups. Law enforcers have to effectively meet the challenges of crime control and maintenance of public order. Hence, the creation of a database of crimes and criminals is needed.
Keywords: Data mining; Clustering; Outlier detection; Rule engine.
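The grouping and outlier ideas described in the abstract can be illustrated with a minimal sketch. This is not the method of the surveyed work. It assumes a simple single-pass "leader" clustering over bag-of-words term counts with cosine similarity, and it treats any report left alone in its cluster as an outlier. The report texts, the 0.5 threshold, and all function names are hypothetical and chosen only for illustration.

```python
import math
from collections import Counter


def vectorize(text):
    """Turn a text into a bag-of-words term-count vector."""
    return Counter(text.lower().split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a if t in b)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def leader_cluster(texts, threshold=0.5):
    """Single-pass clustering: each text joins the first cluster whose
    leader (its first member) is at least `threshold` similar to it;
    otherwise it starts a new cluster."""
    vectors = [vectorize(t) for t in texts]
    clusters = []  # lists of indices into `texts`; index 0 of each is the leader
    for i, vec in enumerate(vectors):
        for cluster in clusters:
            if cosine(vec, vectors[cluster[0]]) >= threshold:
                cluster.append(i)
                break
        else:
            clusters.append([i])
    return clusters


def outliers(clusters):
    """Assumption: a report that ends up alone in its cluster is an outlier."""
    return [c[0] for c in clusters if len(c) == 1]


# Hypothetical toy reports, invented for illustration only.
reports = [
    "burglary at night through rear window",
    "night burglary rear window forced",
    "car theft from parking lot",
    "theft of car in parking lot",
    "online fraud using fake invoice",
]

clusters = leader_cluster(reports)
print(clusters)
print(outliers(clusters))
```

Leader clustering is used here only because it fits in a few lines and needs no random initialisation. A real system would more likely use weighted features such as TF-IDF and an established clustering algorithm.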
Similar resources
A Joint Semantic Vector Representation Model for Text Clustering and Classification
Text clustering and classification are two main tasks of text mining. Feature selection plays a key role in the quality of the clustering and classification results. Although word-based features such as term frequency-inverse document frequency (TF-IDF) vectors have been widely used in different applications, their shortcoming in capturing semantic concepts of text motivated researchers to use...
Metaheuristic Clustering of Persian XML Documents Based on Structural and Content Similarity
Due to the increasing number of XML documents, effectively organizing these documents in order to retrieve useful information from them is essential. A possible solution is to cluster XML documents in order to discover knowledge. A key issue in clustering XML documents is how to measure the similarity between XML documents. Conventional clustering of text documents using a do...
Document Clustering Based on Ontology and a Fuzzy Approach
Data mining, also known as knowledge discovery in databases, is the process of discovering unknown knowledge from a large amount of data. Text mining applies data mining techniques to extract knowledge from unstructured text. Text clustering is one of the important techniques of text mining; it is the unsupervised classification of similar documents into different groups. The most important step...
Study on Feature Selection Methods for Text Mining
Text mining has been employed in a wide range of applications such as text summarisation, text categorization, named entity extraction, and opinion and sentiment analysis. Text classification is the task of assigning predefined categories to free-text documents. That is, it is a supervised learning technique. While in text clustering (sometimes called document clustering) the possible categor...
A Probabilistic Topic Model Based on Local Word Relations in Overlapping Windows
A probabilistic topic model assumes that documents are generated through a process involving topics and then tries to reverse this process: given the documents, it extracts the topics. A topic is usually assumed to be a distribution over words. LDA is one of the first and most popular topic models introduced so far. In the document generation process assumed by LDA, each document is a distribution o...